-
Abstract We consider inference problems for high-dimensional (HD) functional data with a dense number of T repeated measurements taken for a large number of p variables from a small number of n experimental units. The spatial and temporal dependence, high dimensionality, and dense number of repeated measurements pose theoretical and computational challenges. This paper has two aims; our first aim is to solve the theoretical and computational challenges in testing equivalence among covariance matrices from HD functional data. The second aim is to provide computationally efficient and tuning-free tools with guaranteed stochastic error control. The weak convergence of the stochastic process formed by the test statistics is established under the “large p, large T, and small n” setting. If the null is rejected, we further show that the locations of the change points can be estimated consistently. The estimator's rate of convergence is shown to depend on the data dimension, sample size, number of repeated measurements, and signal-to-noise ratio. We also show that our proposed computation algorithms can significantly reduce the computation time and are applicable to real-world data with a large number of HD-repeated measurements (e.g., functional magnetic resonance imaging (fMRI) data). Simulation results demonstrate both the finite sample performance and computational effectiveness of our proposed procedures. We observe that the empirical size of the test is well controlled at the nominal level, and the locations of multiple change points can be accurately identified. An application to fMRI data demonstrates that our proposed methods can identify event boundaries in the preface of the television series Sherlock. Code to implement the procedures is available in an R package named TechPhD.
-
Summary This paper deals with the detection and identification of changepoints among covariances of high-dimensional longitudinal data, where the number of features is greater than both the sample size and the number of repeated measurements. The proposed methods are applicable under general temporal-spatial dependence. A new test statistic is introduced for changepoint detection, and its asymptotic distribution is established. If a changepoint is detected, an estimate of the location is provided. The rate of convergence of the estimator is shown to depend on the data dimension, sample size, and signal-to-noise ratio. Binary segmentation is used to estimate the locations of possibly multiple changepoints, and the corresponding estimator is shown to be consistent under mild conditions. Simulation studies provide the empirical size and power of the proposed test and the accuracy of the changepoint estimator. An application to a time-course microarray dataset identifies gene sets with significant gene interaction changes over time.
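The binary segmentation step mentioned in the summary can be illustrated with a minimal sketch. This is not the paper's procedure: the contrast used here (a weighted squared Frobenius distance between averaged covariance matrices) and the user-supplied `threshold` are assumptions made for illustration, whereas the paper derives its test statistic and cutoff from asymptotic theory. All function names are hypothetical.

```python
from typing import List, Optional, Sequence

Matrix = List[List[float]]


def sample_cov(X: Sequence[Sequence[float]]) -> Matrix:
    """Sample covariance (p x p) of n observations of dimension p."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    return [
        [sum((row[i] - means[i]) * (row[j] - means[j]) for row in X) / (n - 1)
         for j in range(p)]
        for i in range(p)
    ]


def mean_matrix(mats: Sequence[Matrix]) -> Matrix:
    """Entrywise average of a non-empty sequence of equally sized matrices."""
    m, p = len(mats), len(mats[0])
    return [[sum(M[i][j] for M in mats) / m for j in range(p)] for i in range(p)]


def contrast(covs: Sequence[Matrix], s: int, k: int, e: int) -> float:
    """Illustrative contrast (an assumption, not the paper's statistic):
    weighted squared Frobenius distance between the mean covariance
    on [s, k) and the mean covariance on [k, e)."""
    left, right = mean_matrix(covs[s:k]), mean_matrix(covs[k:e])
    d2 = sum((a - b) ** 2 for rl, rr in zip(left, right) for a, b in zip(rl, rr))
    return (k - s) * (e - k) / (e - s) * d2


def binary_segmentation(covs: Sequence[Matrix], threshold: float,
                        s: int = 0, e: Optional[int] = None) -> List[int]:
    """Recursively split [s, e) at the candidate with the largest contrast,
    stopping when that contrast does not exceed the threshold."""
    if e is None:
        e = len(covs)
    if e - s < 2:
        return []
    best_k = max(range(s + 1, e), key=lambda k: contrast(covs, s, k, e))
    if contrast(covs, s, best_k, e) <= threshold:
        return []
    return (binary_segmentation(covs, threshold, s, best_k)
            + [best_k]
            + binary_segmentation(covs, threshold, best_k, e))


if __name__ == "__main__":
    # Toy input: covariance is the identity for 3 time points, then 4 * identity.
    eye = [[1.0, 0.0], [0.0, 1.0]]
    big = [[4.0, 0.0], [0.0, 4.0]]
    print(binary_segmentation([eye, eye, eye, big, big, big], threshold=1.0))  # [3]
```

In practice each entry of `covs` would be computed from the n units observed at one time point (for example with `sample_cov`), and the threshold would need to come from a properly calibrated test rather than a hand-picked constant.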
